29 research outputs found

    Advancing Distributed Data Management for the HydroShare Hydrologic Information System

    HydroShare (https://www.hydroshare.org) is an online collaborative system to support the open sharing of hydrologic data, analytical tools, and computer models. Hydrologic data and models are often large, extending to multi-gigabyte or terabyte scale, and as a result, the scalability of centralized data management poses challenges for a system such as HydroShare. A distributed data management framework that enables distributed physical data storage and management in multiple locations thus becomes a necessity. We use the iRODS (Integrated Rule-Oriented Data System) data grid middleware as the distributed data storage and management back end in HydroShare. iRODS provides a unified virtual file system for distributed physical storages in multiple locations and enables data federation across geographically dispersed institutions around the world. In this paper, we describe the iRODS-based distributed data management approaches implemented in HydroShare to provide a practical demonstration of a production system for supporting big data in the environmental sciences

    A Resource Centric Approach For Advancing Collaboration Through Hydrologic Data And Model Sharing

    HydroShare is an online, collaborative system being developed for open sharing of hydrologic data and models. The goal of HydroShare is to enable scientists to easily discover and access hydrologic data and models, retrieve them to their desktop or perform analyses in a distributed computing environment that may include grid, cloud or high performance computing model instances as necessary. Scientists may also publish outcomes (data, results or models) into HydroShare, using the system as a collaboration platform for sharing data, models and analyses. HydroShare is expanding the data sharing capability of the CUAHSI Hydrologic Information System by broadening the classes of data accommodated, creating new capability to share models and model components, and taking advantage of emerging social media functionality to enhance information about and collaboration around hydrologic data and models. One of the fundamental concepts in HydroShare is that of a Resource. All content is represented using a Resource Data Model that separates system and science metadata and has elements common to all resources as well as elements specific to the types of resources HydroShare will support. These will include different data types used in the hydrology community and models and workflows that require metadata on execution functionality. The HydroShare web interface and social media functions are being developed using the Drupal content management system. A geospatial visualization and analysis component enables searching, visualizing, and analyzing geographic datasets. The integrated Rule-Oriented Data System (iRODS) is being used to manage federated data content and perform rule-based background actions on data and model resources, including parsing to generate metadata catalog information and the execution of models and workflows. 
This presentation will introduce the HydroShare functionality developed to date, describe key elements of the Resource Data Model and outline the roadmap for future development

    HydroShare – A Case Study of the Application of Modern Software Engineering to a Large Distributed Federally-Funded Scientific Software Development Project

    HydroShare is an online collaborative system under development to support the open sharing of hydrologic data, analytical tools, and computer models. With HydroShare, scientists can easily discover, access, and analyze hydrologic data and thereby enhance the production and reproducibility of hydrologic scientific results. HydroShare also takes advantage of emerging social media functionality to enable users to enhance information about and collaboration around hydrologic data and models. HydroShare is being developed by an interdisciplinary collaborative team of domain scientists, university software developers, and professional software engineers from ten institutions located across the United States. While the combination of non–co-located, diverse stakeholders presents communication and management challenges, the interdisciplinary nature of the team is integral to the project’s goal of improving scientific software development and capabilities in academia. This chapter describes the challenges faced and lessons learned with the development of HydroShare, as well as the approach to software development that the HydroShare team adopted on the basis of the lessons learned. The chapter closes with recommendations for the application of modern software engineering techniques to large, collaborative, scientific software development projects, similar to the National Science Foundation (NSF)–funded HydroShare, in order to promote the successful application of the approach described herein by other teams for other projects

    Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)

    This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including developing pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group's future activities. The main challenge left by the workshop is to see if the groups will execute these activities that they have scheduled, and how the WSSSPE community can encourage this to happen

    An Architectural Overview Of HydroShare, A Next-Generation Hydrologic Information System

    HydroShare is an online, open-source, collaborative system being developed for sharing hydrologic data and models as part of the NSF’s Software Infrastructure for Sustained Innovation (SI2) program. The goal of HydroShare is to enable scientists to easily discover and access hydrologic data and models, retrieve them to their desktop, or perform analyses in a distributed computing environment that may include grid, cloud, or high performance computing. Scientists may also publish outcomes (data, results or models) into HydroShare, using the system as a collaboration platform for sharing data, models, and analyses. HydroShare involves a large distributed software development effort requiring collaboration between domain scientists, software engineers, and software developers across eight U.S. universities, RENCI, and CUAHSI. HydroShare expands the data sharing capabilities of the Hydrologic Information System of the Consortium of Universities for the Advancement of Hydrologic Sciences, Inc. (CUAHSI): It broadens the classes of data accommodated, enables sharing of models and model components, and leverages social media functionality to enhance collaboration around hydrologic data and models. The HydroShare architecture is a stack of storage and computation, web services, and user applications. A content management system, Django+Mezzanine, provides user interface, search, social media functions, and services. A geospatial visualization and analysis component enables searching, visualizing, and analyzing geographic datasets. The integrated Rule-Oriented Data System (iRODS) is used to manage federated data content and perform rule-based background actions on data and model resources, including parsing to generate metadata catalog information and the distributed execution of models and workflows. 
A web browser is the main interface to HydroShare; however, a web services application programming interface (API) supports access through HydroDesktop and other hydrologic modeling systems, and the architecture separates the interface layer from the services layer, exposing all functionality through these web services. This presentation will describe key components of HydroShare and discuss how HydroShare is designed to enable better hydrologic science concomitant with sustainable open-source software practices

    Fourth Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE4)

    This report records and discusses the Fourth Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE4). The report includes a description of the keynote presentation of the workshop, the mission and vision statements that were drafted at the workshop and finalized shortly after it, a set of idea papers, position papers, experience papers, demos, and lightning talks, and a panel discussion. The main part of the report covers the set of working groups that formed during the meeting, and for each, discusses the participants, the objective and goal, and how the objective can be reached, along with contact information for readers who may want to join the group. Finally, we present results from a survey of the workshop attendees

    Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)

    This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including developing pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group’s future activities. The main challenge left by the workshop is to see if the groups will execute these activities that they have scheduled, and how the WSSSPE community can encourage this to happen

    Report on the 3rd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)

    This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including developing pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group's future activities. The main challenge left by the workshop is to see if the groups will execute these activities that they have scheduled, and how the WSSSPE community can encourage this to happen

    Boundary Spanning, Data and Software Curation, and Cyberinfrastructure Deployment: Dimensions for Developing Science Gateways

    There are many incentives to repurpose science cyberinfrastructure capabilities developed for one domain application for use in another domain. These incentives include funder programmatic goals, potential efficiencies through reuse, generative value creating new science, and supporting scientific reproducibility. Whether you call it crossing the ‘valley of death’, tech infusion, or developing a science gateway, bringing data science, computer science, and domain science experts together to create a domain-specific science platform based on existing capabilities can be a frustrating and laborious process with no guarantee of success. These challenges may be as fundamental as communicating and translating a concept from one domain to another, finding a common basis of understanding to transfer domain knowledge and technical knowledge between the two communities, or instilling leading-edge curation and development practices. This poster outlines the approach that the Renaissance Computing Institute (RENCI) has developed to address these aspects of the challenges. RENCI works on both the technical side and the knowledge side to bridge these gaps. More specifically, RENCI’s hybrid approach includes technology hardening and refactoring, an interoperable technology stack driven by rules and policies, and a focus on boundary objects and boundary-spanning project expertise. RENCI’s experience developing science cyberinfrastructure has contributed to the development of our approach. Our poster will describe how these methods have been developed and deployed in the context of a number of RENCI projects, including iRODS (Integrated Rule-Oriented Data System), xDCIShare (Cross-domain cyberinfrastructure), and Risk Analytics Discovery Environment (RADE). In addition to describing the application and methods, we will also share some lessons learned along the way in the context of RENCI projects.

    Developing Scientific Software through the Open Community Engagement Process

    Today's research relies on trustworthy software. Many scientists develop their own software; however, the quality of academia-developed software tends to be lower than that of commercially developed software because, in academia, there are barriers to using proven software engineering methods. To help overcome these barriers, the Water Science Software Institute (WSSI) has developed a model, the Open Community Engagement Process (OCEP), which brings software engineers and scientists together to traverse a four-step, iterative process that incorporates Agile development principles and open source mechanics. As a part of OCEP, WSSI has engaged a water science community that uses a scientist-developed computational modeling framework originally developed in the early 1990s. Efforts to improve this software have included two hackathons. This paper compares these hackathons to determine factors that influenced hackathon outcomes. Thorough planning with sufficient lead time before a hackathon, clarification of expectations, sufficient time for discussion of objectives and clarification of domain vocabulary, and co-location of participants were identified as key factors contributing to hackathon success.